2024-04-15 08:30:17
This https://arxiv.org/abs/2401.08772 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCL_…
This https://arxiv.org/abs/2312.03853 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCR_…
The excuse generator produces a text that really could have come from #Wissing. Others fear that a sophisticated AI could replace them; for the #Verkehrsminister (transport minister), even a simple chatbot is enough.
generAItor: Tree-in-the-Loop Text Generation for Language Model Explainability and Adaptation
Thilo Spinner, Rebecca Kehlbeck, Rita Sevastjanova, Tobias Stähle, Daniel A. Keim, Oliver Deussen, Mennatallah El-Assady
https://arxiv.org/abs/2403.07627
This https://arxiv.org/abs/2402.12749 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCL_…
Towards Reliable and Empathetic Depression-Diagnosis-Oriented Chats
Kunyao Lan, Cong Ming, Binwei Yao, Lu Chen, Mengyue Wu
https://arxiv.org/abs/2404.05012
NYC's Microsoft-powered MyCity chatbot, launched as a pilot program last October, often gives inaccurate info, including telling businesses to break the law (Colin Lecher/The City)
https://www.thecity.nyc/2024/03/29/ai-chat-false-information-small-…
ChatUIE: Exploring Chat-based Unified Information Extraction using Large Language Models
Jun Xu, Mengshu Sun, Zhiqiang Zhang, Jun Zhou
https://arxiv.org/abs/2403.05132
From @…
NYC AI Chatbot Touted by Adams Tells Businesses to Break the Law https://www.thecity.nyc/2024/03/29/ai-chat-false-…
FhGenie: A Custom, Confidentiality-preserving Chat AI for Corporate and Scientific Use
Ingo Weber, Hendrik Linka, Daniel Mertens, Tamara Muryshkin, Heinrich Opgenoorth, Stefan Langer
https://arxiv.org/abs/2403.00039
New York City’s new #AI chatbot (predictably) sometimes gives bad, law-breaking advice to residents’ queries about how to run their businesses.
Q: What’s worse than a bot that always gives bad advice?
A: A bot that *sometimes* gives bad advice.
“NYC AI Chatbot Touted by Adams Tells Businesses to Break the Law”
The Use of Generative Search Engines for Knowledge Work and Complex Tasks
Siddharth Suri, Scott Counts, Leijie Wang, Chacha Chen, Mengting Wan, Tara Safavi, Jennifer Neville, Chirag Shah, Ryen W. White, Reid Andersen, Georg Buscher, Sathish Manivannan, Nagu Rangan, Longqi Yang
https://arxiv.org/abs/2404.04268
#AlaskaAirlines Fun Facts
GENEVIC: GENetic data Exploration and Visualization via Intelligent interactive Console
Anindita Nath (Center for Precision Health, McWilliams School of Biomedical Informatics, UT Health Houston, TX), Savannah Mwesigwa (Center for Precision Health, McWilliams School of Biomedical Informatics, UT Health Houston, TX), Yulin Dai (Center for Precision Health, McWilliams School of Biomedical Informatics, UT Health Houston, TX), Xiaoqian Jiang (Department of Health Data Science and Artificia…
Of all the baffling GenAI decisions out there, I am maybe most baffled by Snapchat's “My AI” chatbot.
• Takes up space at the top of my chat list, above friends I've chatted with within the past few days, despite the fact I haven't engaged with the bot in the near year it's been out
• Can only be hidden/removed if you pay for Snapchat's premium membership
Who is this for? Who's going to a social platform to talk with a chatbot? Is the platform so emp…
Understanding and Mitigating Harmful Design in User-Generated Virtual Worlds
Zinan Zhang, Xinning Gui, Yubo Kou
https://arxiv.org/abs/2405.05922 https://…
This https://arxiv.org/abs/2401.14040 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCL_…
Dynamic Scaling of Two-Dimensional Polar Flocks
Hugues Chaté, Alexandre Solon
https://arxiv.org/abs/2403.03804 https://arxiv.or…
Exploring Backdoor Vulnerabilities of Chat Models
Yunzhuo Hao, Wenkai Yang, Yankai Lin
https://arxiv.org/abs/2404.02406 https://arxiv…
Crafting Knowledge: Exploring the Creative Mechanisms of Chat-Based Search Engines
Lijia Ma, Xingchen Xu, Yong Tan
https://arxiv.org/abs/2402.19421 https:/…
This https://arxiv.org/abs/2310.04799 has been replaced.
link: https://scholar.google.com/scholar?q=a
AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System
Zhiwei Liu, Weiran Yao, Jianguo Zhang, Liangwei Yang, Zuxin Liu, Juntao Tan, Prafulla K. Choubey, Tian Lan, Jason Wu, Huan Wang, Shelby Heinecke, Caiming Xiong, Silvio Savarese
https://arxiv.org/abs/2402.15538
Copilot Evaluation Harness: Evaluating LLM-Guided Software Programming
Anisha Agarwal, Aaron Chan, Shubham Chandel, Jinu Jang, Shaun Miller, Roshanak Zilouchian Moghaddam, Yevhen Mohylevskyy, Neel Sundaresan, Michele Tufano
https://arxiv.org/abs/2402.14261
This https://arxiv.org/abs/2312.06024 has been replaced.
link: https://scholar.google.com/scholar?q=a
This https://arxiv.org/abs/2401.15182 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csHC_…
"Ask Me Anything": How Comcast Uses LLMs to Assist Agents in Real Time
Scott Rome, Tianwen Chen, Raphael Tang, Luwei Zhou, Ferhan Ture
https://arxiv.org/abs/2405.00801 <…
WildChat: 1M ChatGPT Interaction Logs in the Wild
Wenting Zhao, Xiang Ren, Jack Hessel, Claire Cardie, Yejin Choi, Yuntian Deng
https://arxiv.org/abs/2405.01470
This https://arxiv.org/abs/2402.14301 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csIR_…
This https://arxiv.org/abs/2402.16107 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCL_…
GenSERP: Large Language Models for Whole Page Presentation
Zhenning Zhang, Yunan Zhang, Suyu Ge, Guangwei Weng, Mridu Narang, Xia Song, Saurabh Tiwary
https://arxiv.org/abs/2402.14301
This https://arxiv.org/abs/2403.13592 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCL_…
Enhancing Depression-Diagnosis-Oriented Chat with Psychological State Tracking
Yiyang Gu, Yougen Zhou, Qin Chen, Ningning Zhou, Jie Zhou, Aimin Zhou, Liang He
https://arxiv.org/abs/2403.09717
VCounselor: A Psychological Intervention Chat Agent Based on a Knowledge-Enhanced Large Language Model
H. Zhang, Z. Qiao, H. Wang, B. Duan, J. Yin
https://arxiv.org/abs/2403.13553
Iterative Reasoning Preference Optimization
Richard Yuanzhe Pang, Weizhe Yuan, Kyunghyun Cho, He He, Sainbayar Sukhbaatar, Jason Weston
https://arxiv.org/abs/2404.19733 https://arxiv.org/pdf/2404.19733
arXiv:2404.19733v1 Announce Type: new
Abstract: Iterative preference optimization methods have recently been shown to perform well for general instruction tuning tasks, but typically make little improvement on reasoning tasks (Yuan et al., 2024, Chen et al., 2024). In this work we develop an iterative approach that optimizes the preference between competing generated Chain-of-Thought (CoT) candidates by optimizing for winning vs. losing reasoning steps that lead to the correct answer. We train using a modified DPO loss (Rafailov et al., 2023) with an additional negative log-likelihood term, which we find to be crucial. We show reasoning improves across repeated iterations of this scheme. While only relying on examples in the training set, our approach results in increasing accuracy for Llama-2-70B-Chat from 55.6% to 81.6% on GSM8K (and 88.7% with majority voting out of 32 samples), from 12.5% to 20.8% on MATH, and from 77.8% to 86.7% on ARC-Challenge, which outperforms other Llama-2-based models not relying on additionally sourced datasets.
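To make the "modified DPO loss with an additional negative log-likelihood term" concrete, here is a minimal sketch for a single preference pair. It is not the authors' implementation. The abstract does not say how the NLL term is weighted or normalized, so the coefficient `alpha`, the division by the winning sequence length, and all function and parameter names are assumptions made for illustration.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_nll_loss(
    policy_logp_win: float,   # log-prob of the winning CoT + answer under the policy
    policy_logp_lose: float,  # log-prob of the losing CoT + answer under the policy
    ref_logp_win: float,      # same sequences scored by the frozen reference model
    ref_logp_lose: float,
    win_length: int,          # token count of the winning sequence (assumed normalizer)
    beta: float = 0.1,        # DPO temperature (hypothetical default)
    alpha: float = 1.0,       # weight of the NLL term (hypothetical default)
) -> float:
    # Standard DPO term: reward margin between winner and loser,
    # each measured relative to the reference model.
    margin = beta * (
        (policy_logp_win - ref_logp_win) - (policy_logp_lose - ref_logp_lose)
    )
    dpo_term = -log_sigmoid(margin)

    # Extra NLL term on the winning sequence only, length-normalized (assumption).
    nll_term = -policy_logp_win / win_length

    return dpo_term + alpha * nll_term


if __name__ == "__main__":
    # Policy equal to reference: the DPO term is log(2), the NLL term is 10 / 5.
    print(dpo_nll_loss(-10.0, -12.0, -10.0, -12.0, win_length=5))
```

In this sketch the NLL term keeps pushing up the likelihood of the winning sequence itself, whereas the DPO term alone only constrains the gap between winner and loser.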
Fact Checking Chatbot: A Misinformation Intervention for Instant Messaging Apps and an Analysis of Trust in the Fact Checkers
Gionnieve Lim, Simon T. Perrault
https://arxiv.org/abs/2403.12913